End-to-End Weak Supervision
Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications by replacing the tedious manual collection of ground-truth labels. Current state-of-the-art approaches that use no labeled training data, however, require two separate modeling steps: learning a probabilistic latent variable model based on the WS sources - making assumptions that rarely hold in practice - followed by downstream model training. Importantly, the first modeling step does not consider the performance of the downstream model. To address these shortcomings, we propose an end-to-end approach that directly learns the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing previously proposed probabilistic posteriors with a neural network. Our results show improved performance over prior work in terms of end-model performance on downstream test sets, as well as improved robustness to dependencies among weak supervision sources.
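As a rough illustration of this end-to-end idea, the sketch below jointly trains a label model that reparameterizes the posterior over latent labels as a neural network and a downstream classifier, coupling them through a symmetric agreement loss. All names, sizes, and the exact loss are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' released code) of joint end-to-end
# training, assuming binary labels and a matrix of weak-supervision votes.
import torch
import torch.nn as nn

n_sources, d_features = 10, 64  # hypothetical sizes

label_model = nn.Sequential(    # reparameterizes the posterior P(y | WS votes)
    nn.Linear(n_sources, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
end_model = nn.Sequential(      # downstream classifier on raw features
    nn.Linear(d_features, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt = torch.optim.Adam([*label_model.parameters(), *end_model.parameters()], lr=1e-3)

def agreement_loss(votes, X):
    """Maximize agreement between the two probabilistic outputs via a
    symmetric binary cross-entropy (the paper's exact objective may differ)."""
    q = label_model(votes)      # probabilistic labels from the WS sources
    p = end_model(X)            # downstream model predictions
    bce = nn.functional.binary_cross_entropy
    return bce(p, q.detach()) + bce(q, p.detach())

# one hypothetical training step on a random batch
X = torch.randn(16, d_features)
votes = torch.randint(-1, 2, (16, n_sources)).float()  # votes in {-1, 0, +1}
loss = agreement_loss(votes, X)
opt.zero_grad(); loss.backward(); opt.step()
```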
Boosting with Multiple Sources
We study the problem of learning accurate ensemble predictors, in particular boosting, in the presence of multiple source domains. We show that standard convex-combination ensembles cannot, in general, succeed in this scenario, and we adopt instead a domain-weighted combination. We introduce and analyze a new boosting algorithm, MULTIBOOST, for this scenario and show that it benefits from favorable theoretical guarantees. We also report the results of several experiments demonstrating that our algorithm outperforms natural baselines on multi-source text-based, image-based, and tabular data. We further present an extension of our algorithm to the federated learning scenario and report favorable experimental results for that setting as well. Additionally, we describe in detail an extension to the multi-class setting, MCMULTIBOOST, for which we also report experimental results.
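A minimal sketch of the contrast the abstract draws, with hypothetical scores and weights: it shows a domain-weighted combination (one weight vector per source domain) next to the standard convex combination, not the MULTIBOOST training procedure itself.

```python
# Illustrative contrast only; all data below is made up.
import numpy as np

def convex_ensemble(base_preds, alpha):
    """Standard ensemble: one fixed weight vector `alpha` for all domains.
    base_preds: (n_predictors, n_samples) array of scores in [-1, 1]."""
    return alpha @ base_preds

def domain_weighted_ensemble(base_preds, alpha_per_domain, domain_ids):
    """Domain-weighted ensemble: sample from domain k is scored with
    weight vector alpha_per_domain[k] of shape (n_predictors,)."""
    weights = alpha_per_domain[domain_ids]       # (n_samples, n_predictors)
    return np.einsum('sp,ps->s', weights, base_preds)

rng = np.random.default_rng(0)
base_preds = rng.uniform(-1, 1, size=(3, 8))     # 3 base predictors, 8 samples
domain_ids = rng.integers(0, 2, size=8)          # each sample from domain 0 or 1
alpha = np.array([0.5, 0.3, 0.2])
alpha_per_domain = np.array([[0.7, 0.2, 0.1],    # domain 0 trusts predictor 0
                             [0.1, 0.2, 0.7]])   # domain 1 trusts predictor 2
print(convex_ensemble(base_preds, alpha))
print(domain_weighted_ensemble(base_preds, alpha_per_domain, domain_ids))
```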
Theory and Approximate Solvers for Branched Optimal Transport with Multiple Sources
Branched optimal transport (BOT) is a generalization of optimal transport in which transportation costs along an edge are subadditive. This subadditivity models an increase in transport efficiency when shipping mass along the same route, favoring branched transportation networks. We here study the NP-hard optimization of BOT networks connecting a finite number of sources and sinks in $\mathbb{R}^2$. First, we show how to efficiently find the best geometry of a BOT network for many sources and sinks, given a topology. Second, we argue that a topology with more than three edges meeting at a branching point is never optimal. Third, we show that the results obtained for the Euclidean plane generalize directly to optimal transportation networks on two-dimensional Riemannian manifolds. Finally, we present a simple but effective approximate BOT solver combining geometric optimization with a combinatorial optimization of the network topology.
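For intuition about the subadditive costs, here is a minimal sketch using the classical Gilbert edge cost $m^\alpha \cdot \text{length}$ with $\alpha \in [0, 1]$, a standard BOT formulation; the coordinates and masses below are made up. With $\alpha < 1$ the cost is subadditive in the mass, so merging flows at a branching point can beat direct shipping.

```python
# Classical Gilbert cost for a branched transport network (hypothetical data).
import numpy as np

def bot_cost(nodes, edges, alpha=0.5):
    """nodes: dict name -> 2D coordinate; edges: list of (u, v, mass).
    An edge costs mass**alpha times its Euclidean length; for alpha < 1,
    (m1 + m2)**alpha <= m1**alpha + m2**alpha, so joint shipping is cheaper."""
    total = 0.0
    for u, v, mass in edges:
        length = np.linalg.norm(np.asarray(nodes[u]) - np.asarray(nodes[v]))
        total += mass ** alpha * length
    return total

# Two unit sources shipped to one sink, either directly or via a branching point.
nodes = {"s1": (0.0, 1.0), "s2": (0.0, -1.0), "b": (0.5, 0.0), "t": (2.0, 0.0)}
direct = [("s1", "t", 1.0), ("s2", "t", 1.0)]
branched = [("s1", "b", 1.0), ("s2", "b", 1.0), ("b", "t", 2.0)]
print(bot_cost(nodes, direct), bot_cost(nodes, branched))  # branched is cheaper here
```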
HF-RAG: Hierarchical Fusion-based RAG with Multiple Sources and Rankers
Payel Santra, Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar
Leveraging both labeled data (input-output associations) and unlabeled data (wider contextual grounding) may provide complementary benefits in retrieval-augmented generation (RAG). However, effectively combining evidence from these heterogeneous sources is challenging because the respective similarity scores are not directly comparable. Additionally, aggregating beliefs from the outputs of multiple rankers can improve the effectiveness of RAG. Our proposed method first aggregates the top documents from a number of IR models using a standard rank-fusion technique for each source (labeled and unlabeled). Next, we standardize the retrieval score distributions within each source by applying a z-score transformation before merging the top-retrieved documents from the two sources. We evaluate our approach on the fact verification task, demonstrating that it consistently improves over the best-performing individual ranker or source and also shows better out-of-domain generalization.
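A minimal sketch of the two-stage fusion described above, assuming hypothetical rankings and reciprocal rank fusion as the "standard rank-fusion technique"; the paper's exact constants, fusion formula, and merge rule may differ.

```python
# Stage 1: fuse rankers within each source; stage 2: z-normalize per source
# so the two sources' scores become comparable, then merge.
from statistics import mean, stdev

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several rankers' ranked doc-id lists for ONE source via RRF."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

def z_normalize(scores):
    """Standardize one source's fused scores (z-score transformation)."""
    mu, sigma = mean(scores.values()), stdev(scores.values())
    return {doc: (s - mu) / sigma for doc, s in scores.items()}

# Two rankers per source; doc ids are shared across sources (hypothetical).
labeled = z_normalize(reciprocal_rank_fusion([["d1", "d2", "d3"], ["d2", "d1", "d4"]]))
unlabeled = z_normalize(reciprocal_rank_fusion([["d5", "d2", "d6"], ["d2", "d5", "d1"]]))

merged = dict(labeled)
for doc, s in unlabeled.items():
    merged[doc] = max(merged.get(doc, float("-inf")), s)  # one merge choice of several
print(sorted(merged, key=merged.get, reverse=True))
```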
A Survey of LLM $\times$ DATA
Xuanhe Zhou, Junxuan He, Wei Zhou, Haodong Chen, Zirui Tang, Haoyu Zhao, Xin Tong, Guoliang Li, Youmin Chen, Jun Zhou, Zhaojun Sun, Binyuan Hui, Shuo Wang, Conghui He, Zhiyuan Liu, Jingren Zhou, Fan Wu
The integration of large language models (LLM) and data management (DATA) is rapidly redefining both domains. In this survey, we comprehensively review their bidirectional relationship. On the one hand, DATA4LLM, spanning large-scale data processing, storage, and serving, feeds LLMs with the high-quality, diverse, and timely data required for stages like pre-training, post-training, retrieval-augmented generation, and agentic workflows: (i) data processing for LLMs includes scalable acquisition, deduplication, filtering, selection, domain mixing, and synthetic augmentation; (ii) data storage for LLMs focuses on efficient data and model formats, distributed and heterogeneous storage hierarchies, KV-cache management, and fault-tolerant checkpointing; (iii) data serving for LLMs tackles challenges in RAG (e.g., knowledge post-processing), LLM inference (e.g., prompt compression, data provenance), and training strategies (e.g., data packing and shuffling). On the other hand, in LLM4DATA, LLMs are emerging as general-purpose engines for data management. We review recent advances in (i) data manipulation, including automatic data cleaning, integration, and discovery; (ii) data analysis, covering reasoning over structured, semi-structured, and unstructured data; and (iii) system optimization (e.g., configuration tuning, query rewriting, anomaly diagnosis), powered by LLM techniques like retrieval-augmented prompting, task-specialized fine-tuning, and multi-agent collaboration.
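As one concrete DATA4LLM example from the processing stage listed above, the sketch below performs exact, hash-based document deduplication; production pipelines typically use fuzzy methods such as MinHash, so this is only the simplest illustration.

```python
# Exact deduplication of a text corpus by hashing normalized content.
import hashlib

def dedup(docs):
    """Keep the first occurrence of each whitespace/case-normalized document."""
    seen, kept = set(), []
    for doc in docs:
        key = hashlib.sha256(" ".join(doc.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept

corpus = ["The cat sat.", "the  cat sat.", "A different line."]
print(dedup(corpus))  # ['The cat sat.', 'A different line.']
```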
On Learning Domain-Invariant Representations for Transfer Learning with Multiple Sources
While rigorous theoretical work has characterized domain-invariant representations and their trade-offs for single-source domain adaptation (DA), the same cannot be said for the multiple-source DA and domain generalization (DG) settings, which are remarkably more complicated and sophisticated due to the involvement of multiple source domains and the potential unavailability of the target domain during training. In this paper, we develop novel upper bounds for the target general loss, which lead us to define two kinds of domain-invariant representations. Finally, we conduct experiments to inspect the trade-offs of these representations, offering practical hints on how to use them in practice and exploring other interesting properties of our developed theory.
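For orientation, target-loss upper bounds in multiple-source settings classically take the following weighted form (in the style of Ben-David et al.); the paper's own bounds are new and are not reproduced here. With mixture weights $\lambda_k$ over $K$ source domains:

$$\epsilon_T(h) \;\le\; \sum_{k=1}^{K} \lambda_k\, \epsilon_{S_k}(h) \;+\; \frac{1}{2} \sum_{k=1}^{K} \lambda_k\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\mathcal{D}_{S_k}, \mathcal{D}_T\right) \;+\; \eta^{*},$$

where $\epsilon_{S_k}$ and $\epsilon_T$ are the source and target losses, $d_{\mathcal{H}\Delta\mathcal{H}}$ measures the divergence between each source distribution $\mathcal{D}_{S_k}$ and the target $\mathcal{D}_T$, and $\eta^{*}$ is the error of the best joint hypothesis on the combined domains. Bounds of this shape motivate representations that shrink the divergence terms without inflating $\eta^{*}$.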